AI Plagiarism

Fake News

A 2017 paper, “3HAN: A Deep Neural Network for Fake News Detection”, claims to detect fake news with a three-level hierarchical attention network that breaks the text into words, sentences, and the headline, then combines them into a single vector for the article. It’s unclear whether it actually flags news it has determined to be false, or whether it simply relies on sensationalism cues. #todo Look for more recent work, perhaps by searching for papers that cite this one.

I suppose in theory you could try to detect how different one news item is from the “consensus”.
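A minimal sketch of that idea, assuming a plain bag-of-words comparison (my choice for illustration, not anything from the 3HAN paper): pool the word counts of the other reports into a “consensus” and score one item by its cosine distance from that pool. The headlines are made up, and a high score only means “unusual wording”, not “false”.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts for a piece of text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def distance_from_consensus(item, others):
    """1 minus the cosine similarity between one item and the pooled counts of the others."""
    consensus = Counter()
    for text in others:
        consensus.update(bag_of_words(text))
    return 1.0 - cosine_similarity(bag_of_words(item), consensus)


# Hypothetical headlines, purely for illustration.
reports = [
    "City council approves new budget for road repairs",
    "Council approves budget to fund road repairs",
    "New city budget approved, road repairs funded",
]
outlier = "Aliens secretly control the city council budget"

print(distance_from_consensus(reports[0], reports[1:]))
print(distance_from_consensus(outlier, reports))
```

The obvious weakness is that a true scoop also looks like an outlier, so this could at best rank items for a human to check.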

AI-Generated Content

In early April 2023, I noticed that Amazon listed a dozen self-published books by “Lorraine Henwood”, all uploaded in the previous week, including a workbook for “Age of Scientific Wellness”. By May the “workbooks” had been removed, except for one about “The Wisdom of Morrie”.

Plagiarism Detection

OpenAI concludes AI writing detectors don’t work

In a section of the FAQ titled “Do AI detectors work?”, OpenAI writes, “In short, no. While some (including OpenAI) have released tools that purport to detect AI-generated content, none of these have proven to reliably distinguish between AI-generated and human-generated content.”

The NYTimes tested a handful of AI image detectors against real and Midjourney-generated images in “How Easy Is It to Fool A.I.-Detection Tools?”, concluding that it’s very hard to tell the difference, especially if an image has been resized or otherwise altered. Conclusion: rely on watermarks.

Alberto Romero at The Algorithmic Bridge:

(If you want to read a more in-depth analysis of how exactly detectors work and fail, I recommend you to check this overview by AI researcher Sebastian Raschka where he reviews the four main types of detectors and explains how they differ. For a hands-on assessment, I loved this article by Benj Edwards on Ars Technica.)


gptzero.me

see https://gptzero.substack.com/

Currently the app uses a few properties: perplexity (the randomness of a text to a model, or how well a language model likes a text) and burstiness (machine-written text exhibits more uniform and constant perplexity over time, while human-written text varies). The app is hosted on Streamlit.
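To make the two terms concrete, here is a toy sketch. It assumes a tiny add-one-smoothed unigram model standing in for a real language model, and it takes burstiness to be the standard deviation of per-sentence perplexity. Both are my simplifications; I don’t know how GPTZero actually computes either number.

```python
import math
import re
import statistics
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train_unigram(corpus):
    """Return a word -> probability function with add-one smoothing."""
    counts = Counter(tokenize(corpus))
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot shared by all unseen words

    def prob(word):
        return (counts[word] + 1) / (total + vocab)

    return prob


def perplexity(prob, sentence):
    """exp of the average negative log-probability per word (lower = more expected)."""
    words = tokenize(sentence)
    if not words:
        raise ValueError("sentence has no words")
    log_sum = sum(math.log(prob(w)) for w in words)
    return math.exp(-log_sum / len(words))


def burstiness(prob, text):
    """Spread of per-sentence perplexity across a text (assumed definition)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if tokenize(s)]
    scores = [perplexity(prob, s) for s in sentences]
    return statistics.pstdev(scores) if len(scores) > 1 else 0.0


# Hypothetical training text and samples, purely for illustration.
model = train_unigram("the cat sat on the mat. the dog sat on the rug.")
uniform = "the cat sat on the mat. the dog sat on the rug."
varied = "the cat sat on the mat. quantum zebras juggle flaming violins."

print(burstiness(model, uniform), burstiness(model, varied))
```

A detector built this way would call the uniform sample more machine-like, which also hints at why formulaic human writing gets flagged.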


Plagiarism Prevention

C2PA

An open technical standard from Adobe, Microsoft, and others that gives publishers, creators, and consumers the ability to trace the origin of different types of media.

OpenAI announced support, plus additional APIs and an upcoming Media Manager that will make it possible for content creators to opt out.

To drive adoption and understanding of provenance standards - including C2PA - we are joining Microsoft in launching a societal resilience fund. This $2 million fund will support AI education and understanding, including through organizations like Older Adults Technology Services from AARP, International IDEA, and Partnership on AI.

Adobe calls it “content credentials”. It works by encoding provenance information through a set of hashes that cryptographically bind to each pixel.
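A rough sketch of what “binding” means here, assuming a simplified manifest of my own invention: hash the asset bytes, store the hash next to the provenance claims, and sign the pair. This is not the C2PA manifest format or any real C2PA library; the HMAC key is a stand-in for the certificate-based signatures the actual standard relies on, and `make_manifest` and `verify_manifest` are hypothetical names.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret; a real system would use a signing certificate.
SIGNING_KEY = b"demo-key-not-a-real-credential"


def make_manifest(asset_bytes, claims):
    """Bind provenance claims to the exact bytes of an asset."""
    record = {
        "claims": claims,
        "asset_sha256": hashlib.sha256(asset_bytes).hexdigest(),
    }
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"record": record, "signature": signature}


def verify_manifest(asset_bytes, manifest):
    """True only if both the claims and the asset bytes are unchanged."""
    payload = json.dumps(manifest["record"], sort_keys=True).encode("utf-8")
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["signature"]):
        return False
    actual_hash = hashlib.sha256(asset_bytes).hexdigest()
    return manifest["record"]["asset_sha256"] == actual_hash


# Four bytes standing in for image pixels, purely for illustration.
pixels = bytes([10, 20, 30, 40])
manifest = make_manifest(pixels, {"creator": "example"})
print(verify_manifest(pixels, manifest))
print(verify_manifest(bytes([10, 20, 31, 40]), manifest))
```

Changing a single byte breaks verification, which is the property that makes provenance checkable but also means any edit, however innocent, needs a fresh manifest.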

But Fast Company thinks it may just confuse things more:

the smallest adjustments made to real images can be flagged as more questionable than completely made up pictures.

PhotoGuard, developed at MIT (VentureBeat)

PhotoGuard makes subtle pixel manipulations to fool common diffusion models.
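As I understand it, the general technique is an adversarial perturbation: nudge each pixel by an amount too small to notice so that the model’s internal encoding of the image drifts toward something else. Below is a toy sketch under heavy assumptions: a 4-pixel “image”, a hand-written linear “encoder” in place of a diffusion model, and a simple signed-gradient loop. None of this is PhotoGuard’s actual code.

```python
def encode(weights, pixels):
    """Toy linear encoder: each latent value is a weighted sum of the pixels."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]


def latent_distance(weights, pixels, target):
    """Squared distance between the encoded image and a target latent."""
    return sum((z - t) ** 2 for z, t in zip(encode(weights, pixels), target))


def immunize(weights, pixels, target, epsilon=0.04, steps=8):
    """Push the encoding toward `target` while keeping every pixel within
    `epsilon` of its original value and inside the range [0, 1]."""
    step_size = epsilon / 4
    current = list(pixels)
    for _ in range(steps):
        residual = [z - t for z, t in zip(encode(weights, current), target)]
        # Gradient of the squared distance with respect to each pixel.
        grad = [2 * sum(row[i] * r for row, r in zip(weights, residual))
                for i in range(len(current))]
        # Take a signed step downhill, then clamp back into the allowed range.
        moved = [x - step_size * ((g > 0) - (g < 0)) for x, g in zip(current, grad)]
        current = [min(max(m, p - epsilon, 0.0), min(p + epsilon, 1.0))
                   for m, p in zip(moved, pixels)]
    return current


# Hypothetical encoder weights, image, and target latent, purely for illustration.
weights = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
image = [0.8, 0.6, 0.2, 0.4]
target = [0.0, 0.0]
protected = immunize(weights, image, target)

print(latent_distance(weights, image, target))
print(latent_distance(weights, protected, target))
```

The pixels barely move, but the encoder’s view of the image shifts, which is the effect the protection depends on.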

Humanize AI Output

BypassGPT claims to adjust any text snippet to make it pass AI plagiarism detectors including CopyLeaks.

Sample BypassGPT Undetectable AI rewrite

My experience

posted

This guy took my DeSci piece, rewrote it slightly, and posted it yesterday to his LinkedIn account, which has 5,000 followers:

https://www.linkedin.com/feed/update/urn:li:activity:6998831186554859520/

500 reactions already

Every paragraph is just a rewrite. I wrote this:

In a DeSci world, the indelible nature of the blockchain closes off many sources of outright fraud. Smart contracts, by eliminating humans from the loop, can’t be bribed or intimidated, for example.

He writes this:

The indelible nature of the blockchain eliminates several sources of blatant #fraud in a #DeSci society. Smart contracts, by removing humans from the loop, cannot be bribed or intimidated.

The whole piece is like this!
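One crude way to put a number on this kind of rewrite is word-set overlap (Jaccard similarity) between the two passages. This is only a sketch of one possible measure, not how any commercial plagiarism checker works, and the “unrelated” sentence is made up for contrast.

```python
import re


def word_set(text):
    """Distinct lowercased words in a passage (hashtags reduce to the bare word)."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Shared distinct words divided by total distinct words."""
    set_a, set_b = word_set(a), word_set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


mine = ("In a DeSci world, the indelible nature of the blockchain closes off "
        "many sources of outright fraud. Smart contracts, by eliminating humans "
        "from the loop, can't be bribed or intimidated, for example.")
his = ("The indelible nature of the blockchain eliminates several sources of "
       "blatant #fraud in a #DeSci society. Smart contracts, by removing humans "
       "from the loop, cannot be bribed or intimidated.")
unrelated = "Quarterly revenue rose on strong smartphone sales."  # made-up contrast

print(round(jaccard(mine, his), 2))
print(round(jaccard(mine, unrelated), 2))
```

More than half of the distinct words are shared despite the synonym swaps, while an unrelated sentence shares none.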

2022-12-01 4:33 PM

plagiarism